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[57] ABSTRACT 

A page-accessing method in a segmented tablespace 10 
which eliminates unnecessary reading and locking. The 
tablespace comprises data pages 18 grouped into identi- 
cally-sized segments 16, each segment storing data for a 
single table. A status indicator 26 for each data page of 
a segment is kept in a separate segment control block 20 
stored on a space map page 14. Five data page status 
indicator values are maintained: 

(1) FULL (26a) - entirely full of current data; 

(2) PARTIALLY FULL (266) - partially full of 
current data; 

(3) UNFORMATTED (26c) - empty; contains no 
data; 

(4) MASS DELETE (26d) - contains only obsolete 
data because of an unqualified deletion (mass de- 
lete) of data; or 

(5) QUALIFIED DELETE (26*) - contains only 
obsolete data because of a qualified deletion of 
data. 

When scanning over the data in a segment, UNFOR- 
MATTED and MASS DELETE pages are skipped. 
QUALIFIED DELETE pages are locked, and then 
skipped if they still contain only obsolete data when the 
lock is obtained. When inserting data into pages, prelim- 
inary reads of UNFORMATTED and MASS DE- 
LETE pages are avoided. Data integrity is ensured by 
placing integrity checking bits at the beginning and end 
of each page. If the bits contain the same value, the 
page's data integrity is intact. If the bits are different, 
the page's previous consistent contents are recovered 
from the log records. 

6 Claims, 3 Drawing Sheets 
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action or operation can be begun again. Once a transac- 

METHOD FOR MINIM IZING LOCKING AND tkm or operation has been completed and its changes to 

READING IN A SEGMENTED STORAGE SPACE the data can be made permanent, the transaction or 

operation is "committed". After its commitment, a 

BACKGROUND OF THE INVENTION 5 transaction or operation cannot be undone. 

1 . Technical Field One important class of data base management systems 
This invention relates to computerized data process- <* relational data base management systems. In a rela- 

ing systems, and more particularly to the deletion, lock- tional data base management system, data is perceived 

ing, and reading of data stored in segmented storage. to exist in one or more tables or "relations", each con- 

2. Description of the Prior Art 10 aisting of a specific number of columns and a variable 
Computerized data base management systems rely on number of unordered rows or "records". The advan- 

very high-speed central processing units for the manip- tage of relational data base management systems is that 

ulation and processing of data, and on much slower their data can be accessed by referring to its content 

physical storage devices for permanent storage of the instead of its specific organization or location in storage, 

data. When physically stored data is required for pro- 15 In the past, the deletion of the entire contents of a 

ceasing it is transferred from the storage device to a large data entity, such as a relational data base table, a 

buffer, where it can be read, updated, or otherwise process also known as a mass delete or an unqualified 

processed If alter processing the page's contents in the delete, has required a great deal of time for large data 

buffer have been updated, the page is copied back to entities* This is because the mass delete process must 

physical storage to replace the data which was previ- 20 delete the entity's records (rows) one by one, writing a 

ously stored. Buffers can reside in either high-speed ( 0 g record for every deleted record. 

electronic storage (real storage) such as the main mem- _ r n 

ory of a central processing unit (CPU), or in high-speed SUMMARY OF THE INVENTION 

physical storage devices (paging devices). One object of this invention is to minimize the 

Each transfer of a page of data from physical storage 25 amount 0 f buffer paging in data base management sys- 

to a buffer, a process known as "buffer paging", re- temS) by reducing the number of pages read, and by 

quires either about 2 milliseconds (ms) or 20 ms, de- reducing page locking. 

pending on whether the page was accessed sequentially Another object of this invention is to reduce the time 

( imme d iatel y after the preceding stored page) or ran- required for the mass deletion of a relational data base 
domly. In typical large data bases having millions of *> 

records of data distributed over perhaps 100,000 data md m ^cved by methods of 

pages, randomly reading the entire data base would readin ^ baert ^ data into pages in a physical stor- 

require over one-half hour for buffer paging ^ Any * m whic h*each data page contains only data 

reduction in the number of data pages required to be gfT^ ^ ratity ^ m which the status of each 

paged into the buffers ^tes^^ect, secant 35 fa ^ m ^ 

miprovement in the data base s r^ormance. |« there- Xthe stora^ceis locked diirhig deletion of all of 

fore desirable to mmtmrze buffer paging by eliminating ^^ 8 ^Sto a data entity. In both methods, 

tss:?^ , used ^n^ 8 i y 2 

in data base inanagement systems to allow a data base to 40 deletion of all of fe data entity s data are treated the 

be concurrently accessed or updated by multiple users « - » pages containing no data. 

or programs. Using page level locking, each page that is ( *e nwthod for reading, pages, pag« contauung at 

acJesse^updatedbya user or program is "locked" to some current data are bo*locked and read Pages 

prevent simuhaneous access or updating by any other containing only obsolete data because of the deletion of 

user or program. Locking is an^ensive operation 45 less than all of the data entity's data are locked, and after 

which uses much CPU time, and may seriously reduce the lock is obtained they are read only if they then 

the system's performance and response time. It is there- contain current data. Pages containing no dat* or oiUy 

fore desirable to rninimizc locking without sacrificing obsolete data due to the deletion of all of the data cnti- 

the concurrency control required for effective use by t/s ncithcr locked . nor / ead : 

multiple users. 50 In the method for inserting data into pages, pages 

W. Chu et id., •'Fault Tolerant Locking for Tightly containing current data and pages containing only obso- 
Coupled Systems", Proceedings of the 3th Symposium lete data because of a deletion of less than all of the data 
on Reliability in Distributed Software and Database entity's data, are locked and read before data is inserted 
Systems, Computing Society Press, copyright into them. Data is inserted immediately into pages con- 

1986, pp. 49-55, disclose a fault-tolerant locking proto- 55 taming no date or containing only obsolete data because 
col in which a lock word containing the status of a of the deletion of all of the data entity's data, without 
group of records is appended to that group. The proces- first reading such pages. ? 
sor consults a status table before flrrrrrirg the group. Other features and advantages of this invention will 
Statuses recorded are: free, locked, update initiated, or become apparent from the following detailed descrip- 
failed. 60 tion of its preferred embodiment, taken in conjunction 

Data base management systems use log records to with the accompanying drawings, 
allow failed i transactions or operations to be undone. BRIEF DESCRIPTION OF THE DRAWINGS 
This capability is critical because if the transaction or 

operation fails midway through its completion, some FIG. 1 shows a segmented tablespace as required for 
data will have been changed while other data will re- 65 the method of the preferred embodiment of this inven- 
main in its previous state. The log records written up to tion. 

the point of the failure can be used to restore or roll FIG. 2 shows a single segment of the segmented 
back the table to its original condition, so that the trans- tablespace of FIG. 1 and its associated space map page. 
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FIG. 3 shows the format of a segment control block 
of the space map page of FIG. 2. 

DESCRIPTION OF THE PREFERRED 

EMBODIMENT 5 

The Segmented Tablespace 

The contents (rows) of relational data base tables are 
normally stored in physical storage in "tablespaces". To 
form a tablespace, one or more areas of physical storage 
are logically concatenated together to form a single, 10 
linear addressing range (ic, starting from zero to 64 
gigabytes). The tablespace is divided (formatted) into 
pages of fixed size, and all pages within the tablespace 
are uniquely identified by their relative page number. 
Each tablespace is associated with a group or pool of 15 
buffers in which the buffer size is equal to the page size 
of the tablespace. 

A segmented tablespace is a particular type of tables- 
pace distinguished by its scheme of internal data organi- 
zation and space management Segmented tablespaces 20 
typically include a header page, space map pages, and 
data pages. The data pages store the actual data, while 
the header and space map pages provide a framework 
for rapidly accessing the data pages. 

The data pages are grouped into segments which are 25 
uniquely identified by their relative segment numbers. 
The segments are of equal size and form the fundamen- 
tal unit of storage in the tablespace. All of the data pages 
in a given segment contain data for that segment's table 
only. As tables require additional storage, more seg- 30 
ments are assigned to them. 

The segments are themselves grouped, and each seg- 
ment group is preceded by a space map page which 
contains entries known as segment control blocks for 
each of the space map page's segments. The segment 33 
control blocks for each table are chained together by 
pointers. The header page identifies the space map 
pages in the same way that the space map pages identify 
the segments, creating a hierarchical storage structure 
within the segmented tablespace. 40 

FIG. 1 shows a typical segmented tablespace 10 in 
which the methods of this invention are particularly 
useful. The tablespace is made up of a large number of 
pages of equal size. The first page in the tablespace 10 is 
the tablespace header 12, of which there is one for the 45 
entire tablespace. The tablespace header 12 contains the 
tablespace identifier, and other information related to 
the tablespace as a whole. The remainder of the tables- 
pace consists of space map pages 14, each followed by 
a number of segments 16. Each segment 16 consists of a 50 
number of data pages 18. Only the data pages 18 contain 
data from rows of tables. The space map pages and the 
header page provide a hierarchical organization which 
facilitates access to data stored on individual data pages. 

A space map page 14 and its succeeding segments 55 
16a, 16b, ... are shown in FIG. 2. The space map page 14 
includes a list of segment control blocks 20, and a 
pointer 22 to the next space map page. The segments 
16a, 166, ... following the space map page 14 contain a 
number of data pages 18, each of which stores the data 60 
for several rows of the table. Because the tablespace 10 
is segmented, ail of the data pages 18 within a given 
segment 16 contain rows for the same table. Adjacent 
segments 16 may contain data for the same or different 
tables. Each segment 16 has a corresponding segment 65 
control block 20. 

A typical segment control block 20 is shown in FIG. 
3. Each segment control block 20 includes a table identi- 



fier 24, a segment status indicator 25, a number of data 
page status indicators 26, and a pointer 28 to the next 
segment control block for the same table. These point- 
ers 28 create chains of segment control blocks 20 be- 
longing to the same table, as seen in FIG. 2. The seg- 
ment control block 20 for a given segment 16 contains a 
respective one of the data page status indicators 36 for 
indicating the status of each data page 18 within that 
segment. Five difTerent values (26a-e) of this status 
indicator are: 

(1) FULL (26a) - the data page is entirely full of 
current data; 

(2) PARTIALLY FULL (266) - the data page is 
partially full of current data; 

(3) UNFORMATTED (26c) - the data page is unfor- 
matted, containing no data; 

(4) MASS DELETE (264) - the data page contains 
only obsolete data as a result of an unqualified or 
mass deletion of a table; and 

(5) QUALIFIED DELETE (26e) - the data page 
contains only obsolete data as a result of a qualified 
deletion in which less than all of the data in a table 
was deleted. 

A preferred method for mass deleting all of the rows 
of a data base table is described next. 

Mass Delete in a Segmented Tablespace 

In a segmented tablespace such as that shown in 
FIGS. 1 and 2, a new method for efficiently performing 
a mass delete of an entire table has been discovered. 
This new mass delete operation docs not access, lock, or 
log the data pages 18 of the table. Instead, the mass 
delete operation only scans the segment control blocks 
20 of the subject table, marking their respective segment 
status indicators 25 as FREE (deallocated). The first 
segment 16 of the table is not deallocated in order to 
allow for subsequent insertion of data. Logging is done 
for the deallocated space map pages 14 to ensure the 
mass delete can be backed out (undone). 

The segment control block's page status indicators 26 
are set to MASS DELETE to show that those pages 
have been deleted as part of a mass delete operation. 
When the segments 16 containing these data pages are 
reallocated, the page status indicators 26 are not 
changed to identify them to the data base management 
system. This is necessary because special action must be 
taken before new data is inserted into mass-deleted data 
pages 18. The MASS DELETE page status indicator 
26d indicates that the data page is logically empty. 
Mass-deleted pages thus contain no active or current 
data —only obsolete data which was not erased during 
the mass delete operation. By avoiding the erasure of 
obsolete data, the time required for a mass delete opera- 
tion is reduced tremendously since page accesses and 
log records are not required for the data pages. 

Table 1 contains an illustrative pseudocode imple- 
mentation of the mass delete operation of this invention. 

TABLE 1 

Pseudocode for Mm Delete la Segmented Tablespace 

101 Lock the cable In exclusive mode. 

/— prevent other Booosrs to the same —/ 
/« table »/ 

102 Acquire a "mass delete" type lock for the table to 
prevent uncommitted free segment* from being reuied by 
any agents including itself. 

/- thu lock will be released when the «/ 
/m mass delete operation U committed «*/ 
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TABLE 1 -continued empty page contains only binary zeroes - no data at all. 

Somewhat similarly, a "logically" empty page contains 



P*^^^^^ 1 ^ no valid data, but is not reset to binary zeroes. A logi- 

103 ?°l^^^ tC ^^!t^ Se cally empty page can be caused by a mass delete (Le., 

»«8mau -/ 5 unqualified delete of aD the data in a table) or by a 

104 Aooea tfac space cup page u ™™t,;ning the qualified delete of less than all the table's data. Thus, 
segment coctroi block 20 for the segment 16. physically empty pages contain no data, logically empty 

103 if it b the ftm segment 16 of the table then. P a «« only obsolete data, and all other pages 

106 write a log record of the before and after partially or fully contain current (i.e., active or valid) 
images of the segment control block 20. 10 data. 

107 ^^d^fwj^j?^ To achieve this objective, each data page's status 

the segment control block 20 to MASS DELETE. _ , t . * . r Al _ «o-. i* .u . 

must be stored separately from the page 18 itself, so that 

108 else. /» not the first segment of the table =/ the page need not be read if it is empty. This indication 

109 h^^^a^ ml ^^^i^ arta is stored in the segment control block 20 as that page's 
no c^c ^^l^Z^^xo 15 respective one of the status indicators 26. One of the 

FREE to identify the segment a » on- five status indications described above should be stored 

deleted free segment without reletting for each data page 18. Each status indication corre- 

tUcc^bk^t^t^t^ui^^ sponds to a unique state of the data page. 

The first status indication FULL 26a is set when no 

the segment control block 20 to mass DELETE. 20 more data can be added to the page 18. During read 

m rnn r onp /_ .n »«m~m i« of ,k. tw^ operations, these pages are first locked and then read. 

12 ™> WP/- prolan .cgnam 16 of the tabled full page, are ignored during insert operations. 

The second status indication PARTIALLY FULL 
It is essential that no other transaction or operation be 26a is set when a page contains at least some current or 
performed on a table being mass deleted. Line 101 of ^ valid data. During a read operation, these pages are first 
Table 1 ensures this by locking the table in exclusive locked and then read normally, like FULL pages. For 
mode, thereby preventing other users or programs from variable length rows (i.e., rows of tables which contain 
accessing any data in that table. At the same time, in line at one variable length column), this second status 
102 a "mass delete" type of lock is acquired to prevent indication can further be subdivided into multiple set- 
uncommitted (not yet permanent) free segments from *> to morc accurately represent the amount of free 
being reused by any user or application program includ- space lcft on ^ 4^ p™ thereby improving the utili- 
ing the inaas delete operation. This inass delete lock is ^tkm of the physical storage device by packing more 
thus acquired in exclusive mode, and will be released rows into the data pages. 

onlywhen the mass delete operation is rjermancnt and ^ The third status indication UNFORMATTED 26c is 

co ?? nitted " . - . . set for physically empty pages, i.e. pages 18 which are 

The delete operation then accesses each seg- ^ ^ guaranteed to be only binary 

mem of the table by following the table's chain of seg- „ Kr . mm ^ ma * aM 1A Je r*m~t*-A a ii 

ment control block! The space map pages 14 are ac- a 8p *^ J m f p J** 5 14 u ^ fma ^& 

, u . ";f , uap pyw 7 , Jr data pages 18 covered by that space map are set to this 
cessed as required for each segment 16 of the table. The ™* iSZ^t^m^^J^^^nu^t rw„ 0 » ™h 
first segment 16 of the table ^handled slightly differ- « * for *i£*? jfff L£2 * utdued ' ErurtoB a read 
ently^Sie succeeding segments (lines 1&-107). For SS!S^^^S^USS^ 'whet 

both, a log record is w^ SSJE^S^ 
unages of the segment control block 20 of the segment 00 ~ w . ™ u uu * c ^ cu mw ™T , . \S ~!1 
If the segment is^otthe first segment on the chain (lines ^ s^ce there is no preexisdng data to be concerned 
108-llirthe segment controlWock is marked as a 4S about However, the requested page is logged, and an 
mass-deleted free segment, although the segment's table m W ^ t0 " rocs * ^ . t0 thc 

identifier 24 is left unchanged. A nonzero segment table P*8 C and it is formatted as a data page containing no 
identifier 24 is used to indicate that the segment cannot data. — , . 

be reallocated while an uncommitted mass delete is in ^ fourtn ****** indication MASS DELETE 26d is 
progress for the table previously associated with that 50 when a page 18 becomes logically empty because of 
segment The final step for each segment, in lines 107 * niass delete (or drop table) operation. This state indi- 
and 111 of Table 1 is to mark all data pages of the seg* cates that the contents of the, page can be ignored even 
ment as having been dr1rtrd t for use when those though the page has not been reset to binary zeroes, 
pages are allocated to other tables. The DO-loop be- Pages with this status indication contain no valid data, 
tween lines 103 and 112 is repeated for every segment of 55 As described above for Table 1, the mass delete opera- 
the table by following the chain of segment control tion changes all the page status indicators 26 of the 
blocks 20. After the last segment has been processed, table's segment control bloc ks 20 to the fourth status 
the mam delete operation is complete and can be com- ind i ca ti on MASS DELETE. During a read operation, 
nutted. MASS DELETE pages are neither locked nor read 

A drop table operation, in which the table is not 60 since they contain no current data. There is no danger 
retained for subsequent insertion of data, operates the that a concurrent mass delete will be uncommitted and 
same as a mass delete except that the first allocated undone because an exclusive lock on the table is held by 
segment for the table is also deallocated, since no fur- the mass delete operation so that another user or appli- 
ther data will be inserted into thc table. cation cannot start a read operation for the table while 

Minimizing Locking and Reading of Empty Pages 65 the mass delete is being performed. During an insert 
The principal objective of this invention is to attempt ope ration , a MASS DELETE page (Like an UNFOR- 
to ininimize locking and reading of data pages 18 that MATTED page) is not first read, since it contains only 
are physically or logically empty. A "physically" obsolete data. However, before any data is inserted into 
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the page, an empty buffer is assigned and .the page is 
formatted. In this formatting process, the page is first 
reset to binary zeroes, and then the page header and 
trailer are set to indicate the page c ontain s no dat a. 

The fifth status indication QUALIFIED DELETE 5 
26e is set when a page 18 becomes logically empty as the 
result of a predicate or qualified delete. This type of 
delete statement is suffixed by conditions which usually 
cause only some of the table's records to be deleted. 
During read operations, QUALIFIED DELETE 10 
pages are First checked to ensure that their logically 
empty state was not caused by an uncommitted delete. 
This checking is done by requesting, waiting for, and 
finally acquiring a lock on the page. After the lock is 
acquired, the read operation skips over the page if it is 13 
still logically empty and contains only obsolete data. 
When data is to be inserted into a QUALIFIED DE- 
LETE page, the page will be read. 
Relation Scan in the Segmented Tablespace 
Table 2 shows a pseudocode implementation of a 20 z» 
type of read operation known as a relation or tabic scan 
according to the preferred embodiment of this inven- 
tion. A table scan sequentially reads a number of pages 
of a table's data. 
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TABLE 2-continued 



Pseudocode for TABLE Son m Segmented Tablespace 



221 
222 



223 
224 



225 



226 
227 



228 



229 



TABLE 2 



23 



Pseudocode for TABLE Scan io Segmented Tabletpacc 



201 



202 



203 



204 



203 



206 



207 



209 
210 



DO LOOP for each segment 16 of the table. 
/= follow the table's chain of segment control =/ 
/— blocks 20 between and across space map — / 
/» pages H 



Access the space map page 14 for the segment 16, 
and read the segment's segment control block 20. 

DO LOOP for each data page 18 of the current 
segment 16 

/=■ follow the list of data page status =/ 
/=> mtttcators 26 within the segment «/ 
/- control block 20 -/ 

SELECT the one of the data page status indicators 26 for 
thee 



30 



35 



231 

232 
233 
234 

235 

236 
237 
238 
239 
240 

241 



40 



242 



243 



212 
213 
214 
215 
216 



217 
218 



219 



220 



CASE (Indicator ts "UNFORMATTED **or "MASS 
DELETE") 

/= page is physically empty and not »/ 
/= yet formatted, or is logically =/ 
/= empty because of mass delete =/ 
DO nothing. 

/= page will not be locked or =/ 

/=> read =/ 

END. 

CASE (Indicator is "QUALIFIED DELETE") 
/— page m logically empty because of — / 
/*=> a qualified delete =»/ 
DO. 

IF table level lock is held by the table 

scan operation THEN 

/a no other user or application 

/— can change the contents of «-/ 

/a this data page »/ 

DO nothing. 

/— page will not be locked — / 
/— and read —/ 

END. 

ELSE. /« page level locking is active -/ 
DO. 

Lock the page In shared mode. 

IF page becomes non-empty THEN 

page contains some =>/ 
/a curreni data ■/ 
DO. 

IF the data page is not 
already in a bufTer THEN 
Assign s buffer to the 
data page. 

Access the data page uid 



read it 
ELSE 

Read the page In the 

buffer. 

END. 

ELSE. /= page ii still -/ 

/= logically empty, -/ 

/ = containing only =/ 

/** obsolete data -/ 

DO nothing. 

/« page will not be »/ 

/- read -/ 

END 

END. 

CASE (indicator is "PARTIALLY FULL** or 
"FULL") 

/» page con tains at least some =■/ 

/ cs current data =/ 

DO. 

IF table level lock is held by the table 

scan operation THEN 

/_ Shared page lock is not — / 

/=» needed because table level «/ 

/= lock is held. =/ 

DO nothing. 

/= page will not be locked =/ 
END. 

ELSE /«■ page level lock is required = / 
Lock the page in shared mode. 

IF the data page is not 
already in a bufTer THEN 
Assign a buffer to the data page. 
Access the data page and read it 
ELSE. 

Read the page in the bufTer. 
END. 

END SELECT. /=* fhnshed evaluating =/ 
/= status indicator of — / 
/• current page =/ 

END LOOP. /« repeat inner loop for next data =/ 
/» page -/ 

END LOOP, /*» repeat outer loop for next segment =*/ 
/- of table -/ 



The relation scan shown in pseudocode in Table 2 
essentially consists of an outer DO-loop over the seg- 

45 ments 16 of the table, an inner DO-loop over the data 
pages 18 of each segment, and within the inner DO-loop 
a SELECT structure having three alternate CASE 
statements at lines 205, 208, and 223 for processing the 
data page according to the value in its status indicator 

50 22 in the segment control block 20 of the space map 
page 14. 

The outer DO-loop between lines 201 and 243 pro- 
cesses each segment 16 of the table by following the 
table's chain of segment control blocks 20 through the 

55 space map pages 14. Each space map page 14 having a 
segment control block 20 for a segment 16 of the table 
is accessed at line 202. 

The inner DO-loop between lines 203 and 242 pro- 
cesses every data page 18 in the current segment 16, 

60 since the segmented tablespace requires each segment 
to contain only data for a given data entity such as a 
table. The inner DO-loop reads through the list of data 
page status indicators 26 of the segment control block 
20, and processes each indicator's corresponding data 

65 page 18 according to the indicator's value using a SE- 
LECT structure. 

The SELECT structure between lines 204 and 241 
includes three alternate CASE statements at lines 205, 
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15 



203, and 228 for processing the different values of the 
data page status indicators 26. The first CASE state- 
ment at line 205 causes the table scan operation to skip 
over physically empty page s having status indicator 
values of "UNFORMATTED", and data pages which 
are logically empty because of a mass delete and there- 
fore have an indicator value of "MASS DELETE". 
Because both of these types of data pages 18 contain no 
valid or current data, they can be ignored. Thus, in lines 
206 and 207 nothing is done, i.e., the page is not locked 
or read. 

The second CASE statement at line 208 handles data 
pages 18 which are logically empty because of a quali- 
fied delete. If the relation scan operation has acquired a 
table level lock over the entire table, no other user or 
application can operate concurrently on that table. Ac- 
cordingly in lines 210-212, "QUALIFIED DELETE" 
pages may be ignored since they contain only obsolete 
data and their content cannot be changed by a concur- 
rent user or application. Again, an unnecessary lock and 20 
read access is avoided. If the relation scan does not have 
a table level lock, and page level locking is active in- 
stead, lines 213-227 are executed to request a shared 
lock on the page (line 215) and if the page contains at 
least some current data when the lock is obtained (foe 25 
216) to access and read the page (lines 217-223). If when 
the lock is obtain the page is still logically empty and 
contains only obsolete data (line 224), then the page can 
safely be ignored and nothing is done, again avoiding 
unnecessarily locking and reading a logically empty 30 
data page. 

The third CASE statement at line 228 captures 
"PARTIALLY FULL" or "FULL" values of the data 
page status indicators 26, which identify pages 18 which 
contain at least some current data. Unless the table scan 33 
operation has acquired a table level lock on the table 
(lines 230-232), the page is locked in shared mode (lines 
233-234). Then if it is not already in a buffer, the data 
page is accessed and read. 

The SELECT structure ends on line 241, and the 40 
inner DO-loop (lines 203-242) is repeated for the next 
data page 18. When the last data page of the current 
segment 16 has been processed through the inner DO- 
loop, the outer DO-loop (lines 201-243) is repeated for 
the next segment 16 of the table until the entire table has 
been scanned. 

Insertion Into the Segmented Tablespace 

Table 3 shows a pseudocode implementation of a 
method for inserting data into a data page 18 in a seg- 
mented tablespace 16 according to this invention. 

TABLE 3 
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TABLE 3-continued 



Pseudocode for Insert in ScgiiHuitcd TanfeTpace 



45 



50 



Pseudocode for Insert hi Segmented Tablespace 

301 DO LOOP for each segment of the table until a 
oon-FULL page b located. 

/— follow the table*a chain of segment control »/ 
/«- blocks between and across space mis pages »/ 

302 Access the space map page for the segment, and 
read the segment's segment control block. 

303 DO LOOP for each data page of the current segment 
until a noo-FULL data page is located. 

/- follow the list of data page states -/ 
fm indicator* within the segment =»/ 
/b control block =»/ 

304 IF page's status indicator is not "FULL" THEN 
/«> page can receive more data «■/ 

305 IF table level lock is held by the 
insert operation THEN 



55 



60 



306 DOfwdmig. 

307 END. 

308 ELSE. 

309 DO. 

310 Lock the page conditionally in 
exclusive mode. 

311 IF lock is acquired THEN 

312 EXIT from inner and outer 
loops, /= proceed with =/ 
/— insertion of data — / 

313 ELSE. 

314 DO nothing. 

/« continue through — / 
/m inner and cuter »/ 
/- loops -/ 

315 END. 

316 END. 

317 ELSE. /— no room for more data to be =/ 
/= inserted «*/ 

318 DO nothing. 

/— continue through inner and =/ 
/«= enter loops ■»/ 

319 END. 

320 END DO for each data page of the current segment 
/ » repeat inner loop for next data page ■»/ 

321 END DO for each segment of the table 
J— repeat outer loop for next segment =/ 

/= allocate a new segment; =«/ 

322 [F no non-FULL page is found THEN 

/= no room in existing segments - add a new one =/ 

323 DO. 

324 Allocate a new segment for the table. 

325 IF table level lock is not held by the insert 
operation THEN 

326 Lock the first page of this new segment 
in exclusive mode, 

327 Add this new segment to the end of the table's 
segment chain. 

328 END. 

329 END IF. 

/= format or read the page as necessary -/ 

330 DO. 

331 SELECT the data page status indicator for the current 
data page. 

332 CASE (indicator is "UNFORMATTED") 
/= page is physically empty and not yet =/ 

/= unformatted and need not be accessed and »/ 
. /» read =/ 

333 DO. 

334 Assign an empty buffer (set to binary zeroes) to 
the page. 

335 Write a format log record with a REDO ONLY log 
indicator. 

336 Format the page as containing no data. 

337 END. 
/— continue with the log and insert at «=»/ 
/- 353-364 »/ 

338 . CASE (indicator is "MASS DELETE") 

/» page is logically empty because of a mass =/ 
/at delete, and need not be accessed and read -/ 

339 DO. 

340 Assign an empty buffer (set to binary zeroes) to 
the page. 

341 Write a format log record with 

. a REDO ONLY log indicator, and 
• a NON-ZERO PHYSICAL PAGE log indicator. 
/«= for use during a system restart to — / 
/- check for inconsistent stored data -/ 

342 Format the page as containing no data. 

343 END. 

/- continue with the log ad insert at -/ 
/« 353-364 -/ 



65 344 CASE (indicator is '^QUALIFIED DELETE" or 
"PARATIALLY FULL") 
/» page is logically empty because of a ■»/ 
/- qualified delete or is partially full of -/ 
/=■ current or valid data, and must be read in =/ 
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Pieodocode far linen in Segmented Tablcapaoe 



345 
346 
347 
348 
349 
350 
351 



332 



333 
334 



333 
356 



337 
338 

339 

360 
361 

362 

363 

364 



/«=> order to log its before imay* 
DO. 

IF the data page a oot already in a buffer THEN 
Assign a buffer to the data page. 
Access the data page and read it 
ELSE. 

Read the page in the buffer. 
END. 

/= continue with the log and insert at 
/= 353-364 =/ 

END SELECT. /= finished formatting and reading, =/ 
/« which depend on the page's status =/ 
/«= indict tor «/ 



/b log the tn a n t h m and insert the new data =■/ 

DO. 

Write a log record of the before and after images 
of the data page, 

Insert the new data into the data page. 
END. 

/= update the data page's status hvtintnr »/ 
DO. 

IF the data page ii now full THEN 
/=» page cannot receive more data =/ 
Set page's status indicator in segment control 
block to "FULL" 

ELSE. /=* page can soil receive more data =/ 
Set page's status indicator in segment control 
block to "PARTIALLY FULL" 
Write a log record of the before and after images of 
the space map page. 

Update the segment's segment control block on the 

space map page. 

END. 
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25 



30 



The pseudocode implementation in Table 3 of an 
insert operation has two major parts: a set of two nested 33 
DO-loops (lines 301-321) for locating a data page 18 
into which new data can be inserted; and a subsequent 
series of modules (lines 322-329, 330-352, 353-356, and 
357-368) for locking, formatting, reading, logging, and 
inserting data into the page and for updating the page's 40 
status indicator 26 in the segment control block 20 on 
the space map page 14. 

The nested DO-loops between lines 301 and 321 lo- 
cate the first available data page 18 into which the new 
data can be inserted. The outer loop between lines 301 45 
and 321 loops over the segments 16 of the table by 
following the table's chain of segment control blocks 20 
along the space map pages 14 of the tablcspace. The 
inner DO- loop between lines 303 and 320 reads down 
the list of data page status indicators 26 of the segment 50 
control block 20 until a nonfuil data page 18 is located. 
When a status indicator 26 other than "FULL" is found 
(lines 304X the DO-loops are exited and the series of 
modules for locking, formatting, reading, etc. are be- 
gun. If all the data pages 18 in the segment 16 are indi- 55 
cated as being full (line 317), the inner DO-loop is exited 
and the outer loop repeated for the next segment 16 
until a nonfuil data page 18 is located If no non-FULL 
page is found, a new segment is allocated to the table 
(lines 322-329). 60 

Once the data page 18 has been locked, it is reformat- 
ted in lines 332-343 or read in lines 344-351 as neces- 
sary. The action taken depends on the value of the 
page's respective one of the status indicators 26. Pages 
18 having "UNFORMATTED" and "MASS DE- 65 
LETE" indicators 26 are reformatted without being 
first read, while pages having indicators "PAR- 
TIALLY FULL" or "QUALIFIED DELETE" must 



be read without being reformatted. A SELECT struc- 
ture between hues 331 and 352 contains three alternate 
CASE statements (lines 332, 338, and 344) which pro- 
cess "UNFORMATTED", "MASS DELETE", and 
"PARTIALLY FULL" or ^QUALIFIED DELETE" 
status indicators 26, respectively. 

If the page is UNFORMATTED, it does not need to 
be read before data is inserted. Accordingly, an empty 
buffer (set to binary zeroes) is assigned to the page (line 
334), a format log record is written with the log indica- 
tor REDO ONLY (line 335), and the empty buffer is 
then formatted as a page containing no data (line 336). 
The buffer is then ready to have the new data inserted 
and to then be logged and written to physical storage. 

MASS DELETE pages containing only obsolete 
data as a result of a mass delete are processed in lines 
338-343. Like unformatted pages, mass-deleted pages 
are not read. Instead, an empty buffer set to binary 
zeroes is assigned to the page, and a format log record 
is written (lines 340-341). The format log record writ- 
ten when a mass-deleted page is reformatted includes 
the REDO ONLY log indicator plus a NONZERO 
PHYSICAL PAGE log indicator. The NONZERO 
PHYSICAL PAGE log indicator is used during a sys- 
tem restart to check for inconsistent stored data, as 
described below under the heading "Data Integrity". 
After the format log record has been written, the now 
empty buffer is formatted with the header and footer for 
a page containing no data (line 342), leaving the page 
ready for the logging and insertion module at lines 
353-364. 

If the page 18 into which the new data will be in- 
serted is PARTIALLY FULL or QUALIFIED DE- 
LETE (logically empty because of a qualified deletion 
leaving the page containing only obsolete data), the 
page 18 must not be reformatted but must instead be 
accessed and read so that a "before image" of its preex- 
isting contents may be logged. The CASE statement at 
line 344 identifies these pages, which are accessed and 
read in lines 345-351, and then processed in the logging 
and insertion module at lines 353-364. 

Once the data page 18 has been reformatted or read as 
necessary, the logging and insertion module of lines 
353-364 writes a tog record of the before and after 
images of the page and inserts the new data into the 
page. The insertion at line 355 includes writing the new 
contents of the page 18 to physical storage. 

The final step of the insertion operation is to update 
the data page's respective one of the status indicators 26 
in its segment control block 20 of the space map page 14 
(lines 357-364). If the new insertion has filled the page 
so that it cannot receive any more data (line 358), the 
page's status indicator is set to "FULL" (line 359). Oth- 
erwise, the page's status indicator 26 is set to "PAR- 
TIALLY FULL" (lines 360-361). A log record of the 
before and after images of the space map page is written 
at line 362, and the segment control block 20 containing 
the revised data page status indicator 26 is updated on 
the space map page 14 (line 363). This updating includes 
transferring the updated space map page 14 to physical 
storage. 

Data Integrity in the Segmented Tablespace 
A data base management system must prevent a user 
or application from accessing inconsistent data. This 
need for data integrity is an essential requirement for a 
usable data base management system. A data page 18 
can contain inconsistent data if the system crashes while 
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the page is being transferred from a buffer to physical TABLE i continued 

storage, leaving the first part of the physically stored 



page containing updated data and the last part of the iwkxxxte for swing Dm integrity 

page containing obsolete data. physical itontge then 

The preferred method for detecting this condition is 5 /^m* the integrity bha -/ 

to maintain two integrity checking bits: one at the be- ah towrt the integrity wt» 

ginning of the page and one at the end. In a page con- /- i.c change both bits to ON* if they -/ 

taming consistent data, these two bits will have the same /sa w currently act to off*, ud to on if »/ 

value (i.e., either both will be ON or both will be OFF). 4 i 2 CMTratly OFF ° 7 

A page's data will be treated as inconsistent if the in teg- 10 ' 
rity bits are different. Every time a page is transferred to 

physical storage, the integrity bits are inverted The method of ensuring data integrity shown in 

(changed from OFF to ON, or from ON to OFF). If the Table 4 consists of two separate modules: one module 

page is only partially transferred, the integrity bits will for checking the integrity bits before reading data from 

not match. 15 a page, and a second module for setting the integrity bits 

As mentioned earlier, a read I/O (i.e., a page access when new data is written to a page. The first module, at 

and read) is not performed when a logically empty page lines 401-409, checks the integrity bits of pages 18 as 

caused by a mass delete is_used for an insert c^peration. they are about to be read (line 401). If the integrity bits 

Since no read I/O is performed, the values of the integ- at beginning and end of the page do not contain the 

rity bits for the physically stored page are not known. 20 sam£ value 0*°* P 8 ^ » marked as being damaged 

This leaves open the possibility that an inconsistent 0*°* 404 )« Damaged pages are returned to their last 

condition could not be detected if the reused page con- consistent state by recovering the page using the log 

taming the newly inserted data was only partially trans- records (line 405). If the page's integrity bits are consis- 

ferred to physical storage, since the new integrity bit at tcnt » nothing is done (lines 406-408) and the page may 

the head of the page might match the old, unrep laced 2* immediately be read normally, 

integrity bit at the end of the physically stored page. The v™** module of the data integrity method 

To avoid this possibility of undetectable inconsistent *1IM*2) resets the integrity bits whenever a page 

data, the NONZERO PHYSICAL PAGE log indica- » 800111 to transferred from a buffcr t0 physical stor- 

tor is included in the page format log record written ^ ^ 410) Before such a transfer the integrity bits 

when a mas*4eleted page is reformatted (line 329 of 30 » bl *£ e cu ™^ ON - *T C 

Table 3), The NONZERO PHYSICAL PAGE log < mvcc f ed > * 01 ^* * ^ m currently 

indicator indicates that the format log record must be °^ ^^verted to ON (line 401) 

reapplied during a system restart, replacing the contents With mctnod f ° r ™*"™S integrity m 

of the physically stored page with binary reroes. Data „ the segmented tabtespace 10, this mventon » methods 

base rnanagement systemTwhich use write-ahead log- 35 for nunimmng locking and reading of da* pages can be 

ging Z^yThccTtotamp value in both the page used to sigmrlcanUy reduce ^n***^****^ 

header and the log record whoever a page is updated! J*™ - during ^^IJ^^^^^X 

Without the NONZERO PHYSICAL PAGE logindJ. measurably improvmg the performance of the 

cator in the format ± log record, during a restart the base manapment system. 

Mowt ui uw luimwiiug ~^ T* ~* TV . j 40 It will be understood that, although specific embodi- 

sy«em would assume that the formerly ^deleted, 40 J^^^JSSSSo- have been d'escribed above 

and now updated, page was successfully transferred to P f various modificaliolB my 

pbyaeal storage even if the page was in feet only par- £STwU.out departing from the spirit and scope of 

tiaUy transferred, because the tunestamp in the physi- Forexample, this invention may beWd 

^ I™ ^^T P p°a op °! « rdifferendy crganizcd'scgmcntcd tablespaces which 

record. With the NONZERO PHYSICAL PAGE log nonethelcM Prevention's essential requirement 

indicator, the system will first set the pageto fafavy ^ a ^ 3tatus ^ stored separately from the 

reroes, and then reset it with the newly-mserted daUu ^ ^ ^ nnecessary Iocking ^ reading of the 

This method of ensunng data integrity m the segmented ™ ecanbe avoided . Furthermore, the invention is not 

tablapace 10 is based on the use of wnte-ahead logging, ^ {T*^ to re utional data base management systems, but 

m which the data base management system is forced to ^ ^ ^ m oth er types of computer- 

write the updated log record before (ahead of) transfer- ^ ^ processing systems. Accordingly, the scope of 

ring the updated page to physical storage. protection of this invention is limited only by the fol- 

A pseudocode implementation of this method for (owing rf«im« 

ensuring data integrity is shown in Table 4. JS Weclaim: 

TABLE 4 1. In a physical storage space managed by a comput- 

Pieuiocode for Earning Dau intefifty erized database management system including at least 

— — F.^wfcb^dfe.w^THEN one data storage device, the storage space consisting of 

/- cta*u» tatrg^w"-/ two or more pages, a method for accessing those pages 

402 do. 60 comprising the steps of: 

403 if the integrity chirking bits ire not consistent (a) storing by the data storage device on any page 
4M ktoktiu dunund data for no more than a single entity; 

403 iWertbe^froTK log records, (b) storing by the data storage device separately from 

406 else, /= the integrity bits ue consistent >/ each page an indication of the status of that page; 

407 do nothing. fij ( c ) determining the status of a selected one of the 
*j* |JJ°- pages based on the status indication thereof; and 

, (d) if the selected page's status is such that the se- 

410 if • page is to be transferred from ■ buffer to lected page contains only obsolete dau because of 
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a deletion of all of the data of the data entity stored 
on the selected page, accessing the selected page as 
if the selected page contains no data. 

2. A method for accessing pages according to claim 1, 
further including the step of: 3 

(e) if the selected page's status is such that the se- 
lected page contains only obsolete data because of 
a deletion of less than all of the data of the data 
entity stored on the selected page, locking the se* \q 
lected page, and then reading the selected page 
only if it then contains current data. 

3. A method for accessing pages according to claim l t 
further comprising the step of: 

(e) locking the pages of the data entity during said 15 
deletion of all of a data entity's data; 
and wherein the accessing step (d) comprises the steps 
of: 

(0 if the selected page's status is such that the se- ^ 
lected page contains at least some current data, 
locking and reading the selected page; 

(ii) if the selected page's status is such that the 
selected page contains only obsolete data be- 
cause of a deletion of less than all of the data of 25 
the data entity stored on he selected page, lock- 
ing the selected page and then reading the se- 
lected page only if the selected page then con- 
tains current data; 

(in) if the selected page's status is such that the 30 
selected page contains no data, skipping the se- 
lected page; and 

(iv) if the selected page's status is such that the 
selected page contains only obsolete data be- 35 
cause of a deletion of all of the data of the data 
entity stored on the selected page, skipping the 
selected page. 

4. A method for accessing pages according to claim 1, 
further comprising the step of: 40 



(e) locking the storage space during a deletion of all 
of the data entity's data; 
and wherein the step (d) comprises the steps of: 

(i) if the selected page's status is such that the se- 
lected page contains current data, reading the 
selected page before inserting data thereinto; 

(ii) if the selected page's status is such that the 
selected page contains only obsolete data be- 
cause of a deletion of less than all of the data of 
the data entity stored on the selected page, read- 
ing the selected page before inserting data there- 
into; 

(iii) if the selected page's status is such that the 
selected page contains no data, inserting data 
thereinto without first reading the selected page; 

(iv) if the selected page's status is such that the 
selected page contains only obsolete data be- 
cause of a deletion of all of the data of the data 
entity stored on the selected page, inserting data 
into the selected page without first reading the 
selected page. 

5. A method according to claim 1, wherein step (b) 
comprises the steps of: 

for each page which is entirely full of current data, 

storing a first status indication; 
for each page which is partially full of current data, 

storing a second status indication; 
for each page which contains no data* storing a third 

status indication; 
for each page which contains only obsolete data as a 

result of an unqualified deletion of data, storing a 

fourth status indication; and 
for each page which contains only obsolete data be- 
cause of a qualified deletion of data, storing a fifth 

status indication 

6. A method for accessing pages according to claim 1, 
wherein the database management system comprises a 
relational database management system, and wherein 
the storage space comprises a relational table space. 
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